THE ASSEMBLY LANGUAGE "MAGAZINE"                 VOL 1 NUMBER 4
                                                 December 1989

Written by and for assembly language programmers.

Table of Contents

Editorial.......................................2
Policy and Guide Lines..........................3
Beginners' Corner...............................5
Structure, Speed and Size
     By Thomas J. Keller........................7
     Editorial Rebuttal........................11
Accessing the Command Line Arguments
     By Thomas J. Keller.......................13
Original Vector Locator
     by Rick Engle.............................15
How to call DOS from within a TSR
     by David O'Riva...........................22
Environment Variable Processor
     by David O'Riva...........................26
Program Reviews................................35
     Multi-Edit ver 4.00a .....................35
     SHEZ......................................36
     4DOS......................................36
Book Reviews...................................37
     Assembly Language Quick Reference
          Reviewed by George A. Stanislav......37
GPFILT.ASM.....................................39
;Page 1

Editorial

It has been much too long since the last issue of the Magazine was published. Much of this time was due to the lack of submissions but there has been enough to assemble since early November. I hope that it will not be as long till the next one is ready for distribution.
You can help make that possible by writing up and sending in an article.

I'm trying out a new editor for this issue. That makes it four editors for 4 issues. There is a review of it in the review section.

There is a continuing and probably insoluble problem in formatting the 'Magazine'. The text portions are more readable with wider margins and are more easily bound with a wide left margin. The difficulty arises when source code is included. 80 columns is little enough in which to fit the code and comments, allowing nothing for margins. So this time we'll try a 5 space margin on the left for the text portion. Further offset should be done with your printer.

A couple of quick notes here as I don't know where else to put them. For the assembly programmer the principal difference in writing for DOS4+ is that there is a possible disk structure using 32 bit sector numbers (partitions larger than 32 megabytes). This of course has no effect as long as you use only the DOS calls for disk access, but if you are going to do direct disk editing this must be checked for.

The occasional ~ is for the use of my spelling checker.
;Page 2

Policy and Guide Lines

The Assembly Language 'Magazine' is edited by Patrick and David O'Riva. We also operate the AsmLang and CFS BBS to distribute the 'Magazine' and to make available as much information as possible to the assembly language programmer. On FidoNet the address is 1:143/37.

Address: 2726 Hostetter Rd
         San Jose, CA 95132
         408-259-2223

Most Shareware mentioned is available on the AsmLang board if local sources cannot be found.

Name and address must be included with all articles and files. Executable file size and percent of assembly code (when available) should be included when a program is mentioned and is required from an author or publisher.

Any article of interest to Assembly language programmers will be considered for inclusion.
Quality of writing will not be a factor, but I reserve the right to try and correct spelling errors and minor mistakes in grammar, and to remove sections. Non-exclusive copyright must be given. No monetary compensation will be made.

Outlines of projects that might be undertaken jointly are welcome. For example: One person who is capable with hardware needs support from a user friendly programmer and a math whiz.

Advertisements as such are not acceptable. Authors and publishers wishing to contribute reviews of their own products will be considered and included as space and time permit. These must include executable file size, percent of assembly code and time comparisons.

Your editor would like information on math libraries, and reviews of such.

Articles must be submitted in pclone readable format or sent E-mail.

Money: Your editor has none. Therefore no compensation can be made for articles included. Subscription fees obviously don't exist. Publication costs I expect to be nil (NUL). Small contributions will be accepted to support the BBS where back issues are available as well as files and programs mentioned in articles (if PD or Shareware ONLY).

Shareware-- Many of the programs mentioned in the "Magazine" are Shareware. Most of the readers are prospective authors of programs that can be successfully marketed as Shareware. If you make significant use of these programs the author is entitled to his registration fee or donation.
;Page 3
Please help Shareware to continue to be a viable marketing method for all of us by urging everyone to register and by helping to distribute quality programs.
;Page 4

Beginners' Corner

I finished up the last column by saying I would discuss more techniques this time. I have entirely forgotten what they were. So without dwelling on that we will just move on to the means of getting your program ready to run. The two formats (.com and .exe) are very different and so will be discussed separately.
COM Programs

On entry all of your segment registers are set to the same value, that of the start of the PSP. Your stack pointer is set to the top of the segment, and your instruction pointer is set to 100h.

You need to make a generous estimate of the maximum amount of stack that your program can use (or count it exactly). Each level of CALL uses 2 bytes (for the address of the next instruction). An INT uses 6 bytes (2 for the IP, 2 for the CS, and 2 for the Flags). Each push of course uses 2. So if your subroutines can go 4 levels deep and contain 7 pushes (without intervening pops) and the deepest contains an INT 21h, then you would need at least 28 bytes of stack (4 x 2 for the calls, 7 x 2 for the pushes, and 6 for the INT). But stack space is cheap, and you might need to change things. So use a nice round number of 128 bytes. BIOS also uses YOUR stack in the earlier versions of DOS, and the guideline for that is at least 128 bytes. Result: 256 bytes is safe for a modest program.

To implement this the following lines of code could be used at the start of the program:
~
        org     100h
        jmp     main
defstack db     32 dup('stack   ')
stacktop label  byte
;other data
main:   cli
        mov     sp,offset stacktop
        sti
~
The db statement is 32 times the string of 8 characters ('stack' plus three blanks), totaling 256 bytes. It could just as well be db 256 dup (?), but it is kind of nice when looking at it with a debugger to see the stack area and how much has been used all nicely labeled. The cli and sti aren't really necessary here because it is only one instruction, but you are dealing with the stack, and it's well to remember that.

At the end of your program you need a label, e.g.
~
progend label   byte
~
Then following your stack adjustment above:
~
        mov     bx,offset progend
        mov     cl,4
        shr     bx,cl
        inc     bx
~
;Page 5
These instructions change the offset value into a number of paragraphs (16 bytes) and round up to the end of the last paragraph. This is the total number of paragraphs that will be occupied by your program.
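Collected in one place, the start-up sequence from this column might look like the sketch below. It also includes the DOS call (function 4Ah) that hands the computed paragraph count to DOS. The segment directives, the error check and the exit code are assumptions added only so that the listing assembles on its own; MASM-style syntax is assumed throughout.

```asm
; Minimal .COM start-up sketch (MASM-style syntax assumed).
; Stack estimate from the text: 4 calls * 2 + 7 pushes * 2 + 6 (INT) = 28 bytes,
; rounded up generously to 256 bytes.
cseg    segment para public 'code'
        assume  cs:cseg, ds:cseg, es:cseg, ss:cseg
        org     100h
start:  jmp     main

defstack db     32 dup('stack   ')      ; 32 * 8 = 256 bytes, easy to spot in a debugger
stacktop label  byte
;other data

main:   cli
        mov     sp, offset stacktop     ; move the stack inside the program image
        sti
        mov     bx, offset progend      ; bytes used, counted from the start of the PSP
        mov     cl, 4
        shr     bx, cl                  ; bytes -> 16-byte paragraphs
        inc     bx                      ; round up to the end of the last paragraph
        mov     ah, 4Ah                 ; DOS: modify allocated memory, ES = PSP segment
        int     21h
        jc      quit                    ; assumed error handling: simply exit

        ; ... the program proper goes here ...

quit:   mov     ax, 4C00h               ; terminate with return code 0
        int     21h

progend label   byte                    ; must stay the last item in the program
cseg    ends
        end     start
```

The stack is moved before the memory block is shrunk, so that SP never points into memory that has already been returned to DOS.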
Then it is necessary to inform DOS of this information:
~
        mov     ah,4ah
        int     21h
~
4Ah is the DOS function to modify allocated memory. It needs the new number of paragraphs in BX (which is where it was put) and the segment of the block in ES (which still holds the PSP segment at this point).

At this point, your program is in an orderly condition. Your data as well as that in the PSP is available with the DS and ES registers. The stack is large enough and well mannered, and all surplus memory is available to you or other programs.
;Page 6

Structure, Speed and Size as Elements of Programming Style

By Thomas J. Keller
P.O. Box 14069
Santa Rosa, CA, 95402

Let us examine the reasons for choosing to implement a given program in assembly language as opposed to some high level language. The reasons most commonly given are execution speed and memory image size.

Execution speed, except in certain highly critical realtime applications, or certain high resolution graphics applications, is probably not a realistic reason to opt for assembly language. For example, a good C compiler with optimization (which precludes use of Turbo or Quick C) produces code which only suffers a 10-15% speed penalty, over typical hand crafted assembly language code. It is possible to write assembly language code which will run faster than this, but few programmers have the requisite skills.

In most applications, a 10-15% speed penalty is simply irrelevant. It is unlikely that the typical user would even notice such a difference. In particular, programs which are highly interactive, and thus spend far and away the greatest amount of time waiting for user input are highly insensitive to such speed penalties. Many people don't realize that even assuming that the user is typing at a rate of 100 wpm (approximately 500 keystrokes/minute), the CPU is still spending the bulk of its time idling, waiting for the next keystroke.

There are, of course, always exceptions to virtually any rule, and there are most certainly exceptions to this rule.
Word processors, for example, while actually accepting text input, are not speed critical. When performing global search and replace, or spell checking, however, even a 10% penalty can become expensive on large documents.

So there is a tradeoff to be made. Assembly language programs cost considerably more than 10-15% more to develop than high level programs. The minutiae involved in managing a massive assembly language programming effort are overwhelming. Assembly language programs take MUCH longer to complete, in almost all cases, than high level programs do, a major contributory factor in the overall cost of development.

Finally, projects developed in high level programming languages are much more likely to be easily ported to platforms based on processors other than the platform on which the project is developed, a very important consideration for a major project. The ability to port a project easily to other platforms increases the market for a product, thereby not only increasing the profitability of the product, but also helping to reduce the sale price of the product (larger market generally translates to lower per unit cost).
;Page 7

So the vendor or developer must analyze the relative impact of a small improvement in execution speed vs a large increase in development time and cost, which consequently translates to higher selling prices, thereby reducing the anticipated market for their product. In many cases, the tradeoffs do not merit choosing assembly language.

Let us turn now to binary image size (memory size). The advantages of small programs are clear, when examining programs which are, in the DOS world, TSRs (The MAC and AMIGA worlds have similar cases, though I am not sufficiently familiar with them to know what they are called). These programs are loaded into memory, and remain there until explicitly removed, which means that the memory they use is NOT available for other uses.
Device drivers similarly use memory, precluding its use for other programs, and therefore also clearly benefit from small size.

In the multi-tasking world (DESQview or PC/MOS, in the PC clone market), small executables also have an advantage, permitting more programs to be run "simultaneously" in a given memory configuration, though running multi-taskers in severely restricted memory configurations probably qualifies as a technical error.

What of normal, single tasking, single user environments (such as DOS, the MAC and AMIGA environments)? Besides the ego boost of creating a very small, very tight utility or application, what benefit is there in generating very small programs? They take less disk space to store, but realistically, at least under DOS, lots of very small utilities may actually not achieve a significant savings in disk space, due to granularity of storage allocation. They load a little faster, in most cases.

But once again, the economics of the issue comes back to haunt us. It is not clear that the effort and expense of writing most applications in assembly language due to size considerations is an economically rational decision. The same economic pressures and considerations apply as do to the execution speed issue discussed above.

On to structure. I must take issue with Patrick O'Riva regarding the purpose and nature of "structured programming." While much of his definition is true, it is incomplete, and appears to reflect a misunderstanding of certain aspects of the structured approach to programming.

Firstly, it is entirely possible (and not altogether a rare occurrence) to write thoroughly unstructured code in PASCAL or C. One must take care to recognize the difference between references to a "block structured" language, as PASCAL and C both are, and "structured programming," which is really a totally separate issue.
Structured programming is an approach to programming that is thoroughly applicable to whatever language a project is being implemented in. It implies firstly a step-wise refinement approach to defining the solution to a problem which the program is to address (in other words, determining the nature of the desired goal, and an at least rational approach to reaching said goal).
;Page 8
Secondly, it involves determining, to the extent possible, the nature and structure of the data that is to be processed by the program. Finally, it involves a top-down approach to the actual coding process.

Just what is a top-down approach? Essentially, this means that we code the high level functionality of the program first, programming simple "do nothing" stubs for the lower levels of the program. As necessary to test the high level code, we implement lower level functions, again, if needed, programming still lower level stubs.

Assuming that the structured design approach of step-wise refinement was used to begin with, the actual coding should really amount to translating the logic flow diagrams, or pseudo-code, or whatever means of recording the refinement process was used, into actual program code. In the ideal situation, the program almost literally codes itself at this point.

There is a myth that "structured programming" means "goto-less" programming. In fact, this is not the case. This myth came into being through misunderstanding of the rather harsh criticism of the "go to" which occurred in the computer science journals beginning in approximately the mid to late sixties. This criticism was based primarily upon the typically excessive use of the "go to" in FORTRAN and BASIC programming at the time. Such indiscriminate use of "goto" led to what has been called "spaghetti" code, code which is virtually impossible to trace or analyze.

In fact, there are many cases in programming where the goto is the most structured solution available.
Structured coding techniques are intended to clarify and make easier the process of analysis, design and implementation of computer programs, not to define rigid, strictly enforced rules in the face of all reason. Structured programming is ALWAYS the best approach to ANY computer program. If the internal requirements of the program, as regards speed or memory utilization, dictate the use of goto's, then use them. A properly documented GOTO can be far more "structured" than an undocumented string of modular function calls. So, back to assembly language programming. When is it appropriate to choose assembly language to implement a program? First, and most obviously, when the speed or memory utilization requirements of the application demand the capabilities that well crafted assembly language offers. Second, perhaps not so obviously, when it is necessary to work at the hardware level a great deal. High level languages, even C, do not generally manipulate hardware registers efficiently. So, if your program makes frequent or widespread use of direct hardware manipulation, it is a likely candidate for assembly language. Finally, and probably the most gratifying reason of all to choose assembly language, is when you want the satisfaction of having tackled a project in assembly and pushed the bits around to suit your purpose. There is little I can imagine that is more satisfying than to reach down into the microprocessor chip and twiddle those bits. Just be sure that you don't let your ego cloud your judgment, when the economics of the project are important (e.g., when a project is to be distributed commercially, or there is an urgent need for speedy ;Page 9 completion). I believe that all PROGRAMMERS (as opposed to casual computer users) should learn the assembly language for the machines on which they work. 
Besides offering the flexibility of shifting to assembly to meet a specific goal, learning assembly intimately familiarizes the programmer with the hardware on which s/he is working. The more you know about your hardware environment, the better off you are.
;Page 10

Editorial Rebuttal

I thank Mr. Keller very much for his article and agree with many of the points he has made. However I must still argue the points of size and speed and justification.

Whenever a program is user limited and will not be used in a multi-tasking environment, as is often the case with a word processor and certain drawing programs, there may be little to be gained in assembly programming. Also there are programs which are DOS limited and little speed increase is possible.

Mr. Keller uses a figure of 10 to 15 percent speed penalty. My experience indicates a value closer to 300 to 400 percent, though direct comparisons are difficult to make because the same programs are usually not written in both assembly and in C. The size difference seems to be a factor of 5 to 10.

The two prime examples I can offer are both by Microsoft, and it can be assumed they make use of an optimizing compiler. Their assembler is approximately 110k in size. A86, while not compatible in syntax, has comparable features. Its size is 22k and it assembles code in about one eighth the time. Microsoft's programmers' editor is roughly 250k. Qedit is about 50k and is a mix of high level and assembly. You can grow gray hairs waiting for the MS editor to do a search and replace, but if you blink you'll miss it with Qedit. A fully capable full screen editor without the extras that make it a pleasure to use can easily be written in less than 5k. Give another 5k for features. What has MS gained with the extra 240k of code?

David has recently completed (though they are still adding modules) a database and accounting program for a multi-office company. A much abbreviated version was threatening to overflow their 384k limit.
Investigation of a Dbase implementation indicated in excess of 500k. Data base sorts used to take 10 hours. They now take 20 minutes. Savings in processing time and entry time plus increased functionality suggest a savings of $5000 to $10,000 per month PER OFFICE. Code size? 35k. Are they unhappy about the $15,000 they've been charged for a program that will get lost in a single floppy disk? Given the above examples, I must maintain that the use of high level language, when there is significant processing to be done, and when it will be used on a regular and continuing basis, benefits only the software corporation, and is detrimental to the end user. On Structured Programming I fully agree with Mr. Keller and hope that he clarified any misconceptions I left you with. I prefer a bottom up construction, but that is only preference and has no effect on the end product. Dave's notes: Mr. Keller mentions that it is possible to get great size/speed reductions, but that few programmers have the requisite skills. But to a large extent, it isn't the skill that makes the program, it's the toolbox. The C language is extremely close to assembly - MSC does a very good job of optimizing - and it takes care of the minutiae for you. The problem with this is that the libraries ;Page 11 supplied with the compilers were written to handle very general cases. The printf() function is an extreme example, but it typifies the problem: If you use printf once in your program to print "Hello", it adds 30K of code! Another concern is that many high-level-language programmers don't even realize that with a tweak here, using putc instead of printf there, they can get much(!) better performance from their programs. Familiarity with the quirks of the compiler being used is a necessity... And even that isn't enough to get good performance out of a large program. AND, it decreases portability. So you're right back into the twiddling usually associated only with assembly. 
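For a sense of scale against the printf() figure quoted above, here is a sketch of an assembly program that prints "Hello" through DOS function 09h. It is not taken from either author; the label names are made up for illustration and MASM-style syntax is assumed. As a .COM file it comes to about 20 bytes: 12 bytes of instructions and 8 bytes of text.

```asm
; HELLO.COM sketch: print a '$'-terminated string with DOS function 09h.
cseg    segment para public 'code'
        assume  cs:cseg, ds:cseg
        org     100h
start:  mov     dx, offset msg          ; DS:DX -> string (DS = CS in a .COM program)
        mov     ah, 9                   ; DOS: display string
        int     21h
        mov     ax, 4C00h               ; terminate with return code 0
        int     21h
msg     db      'Hello', 13, 10, '$'
cseg    ends
        end     start
```

The comparison is not entirely fair to printf(), which also handles formatting, but it shows how much of the 30K is generality that a one-line message never uses.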
I've found that if I use C for anything except flow control and one-shot tools, my programs start to get huge and slow, relative to anything that I've banged out in assembly. The database is a great example - it's a very complicated application, with a completely separated data engine & OS interface. If it had been written in C, it would be working in multiple code segments on a 286 with 4 megs and STILL take hours to run a balance, instead of 35K of code on an XT network terminal with half-hour runs. The database was indeed a massive effort, but at this point it would be possible to strip out the engine and write with ease (and macros - lots of macros) anything that could be done in C or Dbase, and do it much better. And average runtime is cut at least in half, size by 50-90%. With a reasonably solid and application-specific toolbox, the advantages TO THE CUSTOMER of assembly programming completely eclipse those of any other language and the disadvantages of assembly itself. Portability is another issue entirely. If you NEED portability and fast development, and IF run time and general productivity are not a concern, then C probably makes more sense. There's this nagging feeling, though, that if the UNIX OS core had been written in assembly by a reasonably good programmer, and been ported to new systems in kind, that the university systems would be clipping instead of slogging. As far as structured programming goes, I usually design as I go along, and end up with a functional (even rational) structure. Call it "random-access programming." This is probably because I find it difficult to call a routine until I've laid out the calling conventions for it, and while I'm doing that I'll remember another routine that should be written for another module... This is not the generally recommended method, I gather. ;Page 12 Accessing the Command Line Arguments in Assembly Language Programs By Thomas J. Keller P.O. 
Box 14069 Santa Rosa, CA, 95402 If you're like me, you program in several languages, under several different operating systems. Under DOS, one very useful feature is the capability to pass arguments to a program as part of the invocation command line. The use of command line arguments significantly increases the power and flexibility of your programs, as well as improving the "professional look." Many languages support this capability with intrinsic or library routines which facilitate access to these command line arguments. Assembly language, of course, does not. What is a programmer to do? As it turns out, it is quite simple to access the command line arguments under DOS. DOS places the so-called "command tail" (the command line less the actual program name) into a buffer area reserved in the PSP (Program Segment Prefix). This buffer area is known as the DTA (Disk Transfer Area). It is extremely important that you parse the command tail, if you plan to do so at all, immediately upon entering your program. DOS does some particularly obscure and insidious things with this DTA buffer, which will destroy the command tail information. In a .COM format program, the PSP is the first 100h (256) bytes of the program memory image, making access quite straightforward. How do we locate the PSP in a .EXE format program, however? Fortunately, DOS sets the ES segment register to point to the beginning of the PSP under both .COM and .EXE programs. It happens to be the case that DOS also sets all other segment registers to the same location for a .COM program, simply because .COM programs reside in one and only one segment. In an .EXE invocation, the DS and ES registers are set to point to the segment in which the PSP resides as the first 100h bytes. This is the default data segment as well. The DTA begins at offset 80h (128d) from the beginning of the PSP. 
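As a rough sketch of how little code this takes in a .COM program: the byte at PSP offset 80h holds the length of the command tail, the text itself starts at 81h, and it ends with a carriage return (0Dh). The example below skips leading blanks, copies the first argument into a buffer and prints it. The names argbuf, skip, copy and done are made up for illustration, and MASM-style syntax is assumed.

```asm
; FIRSTARG.COM sketch: echo the first command line argument.
; Assumes a .COM program, so DS = ES = CS = PSP segment on entry.
cseg    segment para public 'code'
        assume  cs:cseg, ds:cseg, es:cseg
        org     100h
start:  cld                             ; string instructions count upward
        mov     si, 81h                 ; first character of the command tail
        mov     cl, byte ptr ds:[80h]   ; length of the command tail
        xor     ch, ch
        mov     di, offset argbuf
        jcxz    done                    ; no tail at all
skip:   lodsb                           ; skip leading blanks
        cmp     al, ' '
        jne     copy
        loop    skip
        jmp     short done              ; the tail held nothing but blanks
copy:   cmp     al, ' '                 ; a blank or the CR ends the argument
        je      done
        cmp     al, 0Dh
        je      done
        stosb                           ; store the character in argbuf
        lodsb                           ; fetch the next one
        loop    copy
done:   mov     al, '$'                 ; terminator for DOS function 09h
        stosb
        mov     dx, offset argbuf
        mov     ah, 9                   ; DOS: display string
        int     21h
        mov     ax, 4C00h               ; terminate with return code 0
        int     21h
argbuf  db      128 dup (?)             ; the tail is at most 127 characters
cseg    ends
        end     start
```

Because the result is printed with function 09h, an argument containing a '$' would be cut short on display; a real parser would store the length instead of relying on a terminator.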
When it contains a command tail, the byte at 80h contains the count of the number of bytes actually in the command tail, and the command tail string begins at offset 81h (129d) from the beginning of the PSP. The first byte of this string is normally a blank (20h), and the string is terminated with a carriage return (0dh), which is not included in the count.

The exact means you use to parse the command line arguments is, of course, up to you. One possible approach is as follows:

1) Use the data definition directives to set aside any memory you will need to store information about command line arguments (e.g., buffers for file names, byte or word values for flags and numeric arguments, etc.).
;Page 13

2) Design a routine that starts scanning the command tail string for arguments. A 'first fit' (the shortest match possible) scheme is easiest to program. As each item is located and identified as to type and purpose, store the appropriate information in the data areas you have already set aside.

3) Have a "usage" message defined, and a small routine to print it to the screen (a good idea is to print it to STDERR). Invoke this routine when the first argument on the command line is a '?', or, if the program requires arguments, when it is invoked without them.

4) You now have the switches, filenames, and other command line arguments available. Write your program to use them appropriately.

Included in this issue of Assembly Language Magazine is a source listing, GPFILT.ASM, which is a sample template for a general purpose assembly language filter. This program provides an excellent sample of command line argument parsing and one way of using these arguments (though the method used here is not the same as the one described above).
;Page 14

Original Vector Locator
by Rick Engle
November, 1989

INTTEST is a small assembly program which attempts to find the original address of the INT 21h function handler.
This is valuable if you need to be able to make calls to the original INT 21h function even if a TSR or other program has that interrupt hooked or trapped. This gives your program secure control over the interrupt regardless of who is using it. I did this prototype in an attempt to make certain programs somewhat immune to the effects of destructive viruses that may intercept INT 21h and use it for their own purposes. This technique could be used to find the original address of other MS-DOS interrupts.

I wrote test programs to dump out the address of MS-DOS interrupts (such as INT 21h) and then disassembled portions of MS-DOS at those addresses to identify a stable signature of the interrupt. Then by following the chain to MS-DOS through the PSP (Program Segment Prefix) at offset 5h, I was able to find the segment:offset of the address of the handler for old CP/M calls. This pointed to the correct segment in memory of MS-DOS and from there, after moving my offset backwards about 100h in memory, I scanned for my interrupt signature. Once I got a hit, I calculated the address of the interrupt and then could make calls to INT 21h at the segment:offset found.

This program is a "brute-force" method of finding the original address. If anyone finds or has a better way, I'd be very interested in hearing about it.

NOTE: I have tested this program successfully on MS-DOS 2.11, 3.20, and 3.30.
~
; -----------------------------------------------------------------------
;  INTTEST.ASM   November, 1989   Rick Engle
;
;  Finds the address of the INT 21h function dispatcher to
;  allow the user to make INT 21h calls to the original
;  interrupt regardless of who or what has INT 21h hooked.
;
; -----------------------------------------------------------------------
;
print   macro   print_parm
        push    ax
        push    dx
        mov     ah,9
        mov     dx,offset print_parm
        int     21h
        pop     dx
        pop     ax
        endm
; -----------------------------------------------------------------------
; -                        Start of program                             -
; -----------------------------------------------------------------------
;Page 15
cseg    segment para public 'code'
        assume  cs:cseg,ds:cseg
        org     100h
int_test proc   far
        print   reboot_first
        print   int_address
        mov     cl,21h
        mov     ah,35h                  ; get interrupt vector
        mov     al,cl                   ; for interrupt in cl
        int     21h                     ; do it
        mov     ax,es                   ; lets display the es
        push    cs                      ; set es = cs so that
        pop     es                      ; the stosb works
        mov     di,offset out_byte
        call    conv_word
        print   out_byte
        print   colon
        mov     ax,bx                   ; lets display the bx
        mov     di,offset out_byte
        call    conv_word
        print   out_byte
        print   crlf
        print   display_header2
        mov     ah,byte ptr cs:[05h]    ; Get info from the PSP
        mov     al,byte ptr cs:[06h]    ;
        push    cs                      ; set es = cs so that
        pop     es                      ; the stosb works
        mov     di,offset out_byte
        call    conv_word
        print   out_byte
        print   dash
        mov     ah,byte ptr cs:[07h]    ;
        mov     al,byte ptr cs:[08h]    ;
        mov     di,offset out_byte
        call    conv_word
        print   out_byte
        print   dash
        mov     ah,byte ptr cs:[09h]    ;
        mov     al,byte ptr cs:[0ah]    ;
        mov     di,offset out_byte
        call    conv_word
        print   out_byte
        print   crlf
        print   display_header
        mov     ah,byte ptr cs:[50h]    ; Address of INT 21 op code
        mov     al,byte ptr cs:[51h]    ; in the PSP
;Page 16
        push    cs                      ; set es = cs so that
        pop     es                      ; the stosb works
        mov     di,offset out_byte
        call    conv_word
        print   out_byte
        print   dash
        mov     ah,byte ptr cs:[52h]    ;
        mov     al,byte ptr cs:[53h]    ;
        mov     di,offset out_byte
        call    conv_word
        print   out_byte
        print   dash
        mov     ah,byte ptr cs:[54h]    ;
        mov     al,byte ptr cs:[55h]    ;
        mov     di,offset out_byte
        call    conv_word
        print   out_byte
        print   crlf
        print   far_address
        mov     ax,word ptr cs:[08h]    ;
        mov     segm,ax
        push    cs                      ; set es = cs
        pop     es                      ;
        mov     di,offset out_byte
        call    conv_word
        print   out_byte
        print   colon
        mov     ax,word ptr cs:[06h]    ;
        mov     off,ax
        push    cs                      ; set es = cs so that
        pop     es                      ; the stosb works
        mov     di,offset out_byte
        call    conv_word
        print   out_byte
        print   crlf
        mov     ax,segm
        mov     es,ax
        mov     di,off
        inc     di
        print   function_jmp
        mov     ax,word ptr es:[di+2]   ;
        mov     segm2,ax
        push    cs                      ; set es = cs so that
        pop     es                      ; the stosb works
        mov     di,offset out_byte
        call    conv_word
        print   out_byte
        print   colon
        mov     ax,segm
        mov     es,ax
        mov     di,off
;Page 17
        inc     di
        mov     ax,word ptr es:[di]     ;
        mov     off,ax                  ; save found offset of int 21h
        push    cs                      ; set es = cs so that
        pop     es                      ; the stosb works
        mov     di,offset out_byte
        call    conv_word
        print   out_byte
        print   crlf
;-----------------------------------------------------------------
;si = string   di = string size   es:bx = pointer to buffer to search
;ax = number of bytes in buffer to search. Zero flag set if found
;-----------------------------------------------------------------
        mov     ax,segm2
        mov     es,ax                   ;segment
        mov     bx,off                  ;offset
        sub     bx,0100h                ;backup a bit to catch DOS
        mov     si,offset dos_sig       ;start at modified byte
        mov     di,dos_sig_len          ;enough of a match
        mov     ax,0300h                ;# of bytes to search
        call    search                  ;use our search
        jnz     sig_not_found           ;didn't find int 21h signature
        mov     START_SEGMENT,es        ;set page
        mov     START_OFFSET,ax         ;address of found string
        print   good_address
        mov     ax,START_SEGMENT        ;
        push    cs                      ; set es = cs so that
        pop     es                      ; the stosb works
        mov     di,offset out_byte
        call    conv_word
        print   out_byte
        print   colon
        mov     ax,START_OFFSET         ;
        mov     off,ax                  ; save found offset of int 21h
        push    cs                      ; set es = cs so that
        pop     es                      ; the stosb works
        mov     di,offset out_byte
        call    conv_word
        print   out_byte
        print   crlf
        push    cs                      ; set es = cs
        pop     es
        mov     bx,START_OFFSET
        mov     ax,START_SEGMENT
        mov     word ptr [OLDINT21], bx
        mov     word ptr [OLDINT21+2],ax
        mov     dx,offset test_message
        mov     ah,9
        call    dos_function
        jmp     terminate
sig_not_found:
;Page 18
        print   no_int21_found
terminate:
        mov     ax,4c00h                ; terminate process
        int     21h                     ; and return to DOS

out_byte        db      'XXXX'
                db      '$'
colon           db      ':$'
dash            db      '-$'
crlf            db      10,13,'$'
reboot_first    db      13,10,'INTTEST 1.0',13,10
                db      'Reboot before running this, or',13,10
                db      'make sure INT 21h is not hooked',13,10,13,10,'$'
display_header  db      'HEX data at PSP address 50h is : $'
display_header2 db      'HEX data at PSP address 05h is : $'
int_address     db      'Original INT 21h address is : $'
function_jmp    db      'Jump address at DOS dispatcher : $'
far_address     db      'Far address of DOS dispatcher : $'
good_address    db      'Good INT 21h address found at : $'
test_message    db      13,10,10,'This message is being printed using the INT '
                db      '21h Interrupt',13,10
                db      'Found by Brute Force!!!!',13,10,10,'$'
no_int21_found  db      13,10,'Int 21h address not found!$'
segm            dw      0
segm2           dw      0
off             dw      0
START_OFFSET    dw      0               ;top addr shown on screen
START_SEGMENT   dw      0
;dos_sig        db      08Ah, 0E1h, 0EBh ; mov ah,cl
;                                        ; jmp short label
dos_sig         db      080h, 0FCh, 0F8h ; cmp ah,0F8h
dos_sig_len     equ     $ - dos_sig
OLDINT21        dd      ?               ; Old DOS function interrupt vector
int_test        endp
; -----------------------------------------------------------------------
; -                                                                     -
; -  Subroutine to convert a word or byte to hex ASCII                  -
; -                                                                     -
; -  call with AX = binary value                                        -
; -            DI = address to store string                             -
; -                                                                     -
; -----------------------------------------------------------------------
conv_word proc  near
        push    ax
        mov     al,ah
        call    conv_byte               ; convert upper byte
        pop     ax
        call    conv_byte               ; convert lower byte
        ret                             ; and return
conv_word endp

conv_byte proc  near
        push    cx                      ; save cx
;Page 19
        sub     ah,ah                   ; clear upper byte
        mov     cl,16
        div     cl                      ; divide binary data by 16
        call    conv_ascii              ; the quotient becomes the
        stosb                           ; ASCII character
        mov     al,ah
        call    conv_ascii              ; the remainder becomes the
        stosb                           ; second ASCII character
        pop     cx                      ; restore cx
        ret
conv_byte endp

conv_ascii proc near                    ; convert value 0-0Fh in al
        add     al,'0'                  ; into a "hex ascii" character
        cmp     al,'9'
        jle     conv_ascii_2            ; jump if in range 0-9
        add     al,'A'-'9'-1            ; offset it to range A-F
conv_ascii_2:
        ret                             ; return ASCII character in al
conv_ascii endp
;-----------------------------------------------------------------------
; This routine does a dos function by calling the old interrupt vector
;-----------------------------------------------------------------------
        assume  ds:nothing, es:nothing
dos_function proc
;       mov     cl,ah                   ;move our function # into cl
        pushf                           ;These instructions simulate
                                        ;an interrupt
        cli                             ;turn off interrupts
        call    CS:OLDINT21             ;Do the DOS function
        sti                             ;enable interrupts
        push    cs
        pop     ds
        push    cs
        pop     es
        ret
dos_function endp
;-----------------------------------------------------------------
;si = string   di = string size   es:bx = pointer to buffer to search
;ax = number of bytes in buffer to search. Zero flag set if found
;-----------------------------------------------------------------
SEARCH  PROC    NEAR                    ;si points at string
        PUSH    BX
        PUSH    DI
        PUSH    SI
        XCHG    BX,DI                   ;string size, ptr to data area
        MOV     CX,AX                   ;# chars in segment to search
BYTE_ADD:
        LODSB                           ;char for first part of search
NEXT_SRCH:
        REPNZ   SCASB                   ;is first char in string in buffer
        JNZ     NOT_FOUND               ;if not, no match
        PUSH    DI                      ;save against cmpsb
;Page 20
        PUSH    SI
        PUSH    CX
        LEA     CX,[BX-1]               ;# chars in string - 1
        JCXZ    ONE_CHAR                ;if one char search, we have found it
        REP     CMPSB                   ;otherwise compare rest of string
ONE_CHAR:
        POP     CX                      ;restore for next cmpsb
        POP     SI
        POP     DI
        JNZ     NEXT_SRCH               ;if zr = 0 then string not found
NOT_FOUND:
        LEA     AX,[DI-1]               ;ptr to last first character found
        POP     SI
        POP     DI
        POP     BX
        RET                             ;that's all
SEARCH  ENDP
cseg    ends
        end     int_test
;Page 21
~

How to call DOS from within a TSR
by David O'Riva

Just a few ramblings on interactions between TSRs & DOS.

Cardinal rule: DON'T CALL DOS UNLESS YOU'RE SURE OF THE MACHINE STATE!!!

There are a few interrupt calls and memory locations you can play with to get this information. A list & explanation of sorts is below.

The reason you don't call DOS if you've interrupted the machine in the middle of DOS is that:

1. The stack is unstable as far as DOS is concerned, and you'll probably end up overwriting DOS data or going into the weeds.

2. DOS only keeps one copy of certain crucial information as it processes a disk-related request. i.e.
        BPB's, current sectors, FAT memory images, fun stuff like
        that.  If you interrupt it in the middle, ask for something
        different, then go back, you will probably destroy your disk,
        possibly beyond recall.

     3. DOS simply was not designed to be re-entrant.  The first dozen
        or so function calls (0 through 0C) are cool most of the time;
        the rest are strictly single-processing-stream functions.

     However, there is hope.  And (extra bonus) it happens to be
     compatible with most true MS-DOS releases, many, many brand-name
     DOSes, and most clones.

     What you need to do, after determining that the user wants to pop
     your program up, is set a couple of flags.  One of them prevents
     your program from being popped up AGAIN while the current DOS call
     is completing, and the other tells a timer trap routine to start
     looking for DOS to finish its current process (usually a matter of
     split seconds).  When the timer routine detects that DOS is no
     longer active, it grabs control of the system and runs your TSR.
     At this point, all DOS calls are as safe as they are for a normal
     application.

     What follows is an outline of the code necessary to activate a TSR
     that uses DOS calls.  Depending on the TSR, other things may need
     to be done in these routines as well.  Definitely make sure you
     understand the interactions of the various routines before TSRing
     your background disk formatter.

     Okay, nitty-gritty time...

;Page 22

     You need 5 main chunks of code to do this right:

        a) a bit of extra initialization code
        b) your TSR's main program
        c) activation request server (usually a keypress trap)
        d) timer tick inDOS monitor
        e) DOS busy loop monitor

     And here's what they do:

        a) asks DOS for the location of the inDOS flag, and stores
           that away.

        b) does whatever you want it to.

        c) when the activation requirement is sensed (the user pressed
           the hot-key, the print buffer is empty, the modem is sending
           another packet, whatever) the following steps need to be
           taken:

           1.
              have we already tried to activate, and are waiting for
              DOS to finish?  If so, then ignore the activation
              request.

           2. check the inDOS flag.  If we're not in DOS, then activate
              as usual.

           3. otherwise, set a flag indicating that the TSR wants to
              activate, but can't right now

           4. return to DOS

        d) this is linked in AFTER interrupt 08 - that is, when this
           interrupt happens, call the original INT 08 handler, then
           run your checking code:

           1. does the TSR want to run?  If not, return from the
              interrupt.

           2. check the inDOS flag.  If we're out of DOS, then run your
              code as normal

           3. return from the interrupt

           * * NOTE: This code has to run FAST.  If it's poorly coded,
               you may very well see degraded performance of the
               entire system.

        e) link in to the DOS keyboard busy loop - INT 28.  This
           interrupt is called when DOS is waiting for a keystroke via
           functions 1,3,7,8,0A, and 0C.  If the TSR takes control from
           this loop, then DOS functions ABOVE 0C are safe to use.
           Functions 0 - 0C are NOT safe to use.

;Page 23

           1. Does the TSR want to run?  If not, continue down the
              interrupt chain.

           2. run the TSR as usual

           3. continue down the interrupt chain.

     NOTES:

     The first action your main TSR code should take is to clear the
     flag that indicates the TSR is trying to run.  If this is not done,
     your TSR will re-enter itself at least 18.2 times per second...
     i.e. a MESS.

     The last action your main TSR code should take before leaving is to
     RESET the flags that prevent the TSR from being activated.  If you
     forget to do this, your TSR will run once, then never again...  I
     know from personal experience that this is frustrating to a
     dangerous degree.

     Some of this code is really complicated, so don't get discouraged
     if it takes a few days of tweaking and hair-pulling to get it
     right.

     All numbers in this text are in hex, except the 18.2 timer tick
     rate, which is decimal.

     The timer tick routine is really touchy, at least the way I wrote
     it.  Be very sure yours is reliable if you distribute a program
     with this structure.
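
     A rough skeleton of chunks (a), (c), (d) and (e) from the outline
     above might look something like the following.  Treat it as a
     sketch under assumptions, not as the author's actual code: the
     names indos_ptr, want_popup, in_tsr, dos_idle, run_tsr and tsr_main
     are made up for illustration, the fragment is meant to sit inside
     the TSR's code segment, and a real TSR would also have to save
     registers and switch to its own stack before doing any real work.

```asm
; Sketch only.  All names here are hypothetical; paste inside the TSR's
; code segment (data is addressed through CS, as in a .COM program).

indos_ptr   dd  0           ; far pointer to the inDOS flag
old_int08   dd  0           ; previous INT 08 handler (filled in at install)
old_int28   dd  0           ; previous INT 28 handler (filled in at install)
want_popup  db  0           ; 1 = TSR wants to run but DOS was busy
in_tsr      db  0           ; 1 = TSR body is running, refuse re-entry

; (a) extra initialization, run once before going resident
init_indos  proc near
        push    es
        push    bx
        mov     ah,34h                  ; RESERVED call: get inDOS address
        int     21h                     ; pointer comes back in ES:BX
        mov     word ptr cs:indos_ptr,bx
        mov     word ptr cs:indos_ptr+2,es
        pop     bx
        pop     es
        ret
init_indos  endp

; Returns with ZF set if DOS is idle.  Preserves all registers.
dos_idle    proc near
        push    es
        push    bx
        les     bx,cs:indos_ptr
        cmp     byte ptr es:[bx],0      ; inDOS flag is zero when idle
        pop     bx                      ; POP leaves the flags alone
        pop     es
        ret
dos_idle    endp

; (c) call this from the hot-key trap, with interrupts disabled
request_popup proc near
        cmp     cs:in_tsr,0             ; already popped up?
        jne     rp_done
        cmp     cs:want_popup,0         ; already waiting on DOS?
        jne     rp_done
        call    dos_idle
        jne     rp_defer
        call    run_tsr                 ; DOS is idle: activate now
        ret
rp_defer:
        mov     cs:want_popup,1         ; let INT 08 or INT 28 do it later
rp_done:
        ret
request_popup endp

; (d) timer tick inDOS monitor
new_int08   proc far
        pushf                           ; simulate an INT for the old handler
        call    dword ptr cs:old_int08  ; original handler runs first
        cmp     cs:want_popup,0         ; does the TSR want to run?
        je      t_exit
        cmp     cs:in_tsr,0
        jne     t_exit
        call    dos_idle
        jne     t_exit                  ; DOS still busy, try next tick
        call    run_tsr
t_exit:
        iret
new_int08   endp

; (e) DOS busy loop monitor.  Only functions ABOVE 0C are safe in here.
new_int28   proc far
        cmp     cs:want_popup,0
        je      b_chain
        cmp     cs:in_tsr,0
        jne     b_chain
        call    run_tsr
b_chain:
        jmp     dword ptr cs:old_int28  ; continue down the chain
new_int28   endp

; Common activation path: clear the request first, unlock last.
run_tsr     proc near
        mov     cs:in_tsr,1
        mov     cs:want_popup,0
        sti
        call    tsr_main                ; (b) must preserve all registers
        cli
        mov     cs:in_tsr,0
        ret
run_tsr     endp

tsr_main    proc near                   ; placeholder for the real TSR body
        ret
tsr_main    endp
```

     The install code would call init_indos once, save the old INT 08
     and INT 28 vectors with function 35, and point them at new_int08
     and new_int28 with function 25 before going resident.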
     The reason that functions 0-0C are separated from the rest of the
     DOS calls as far as re-entrancy is concerned is that they use an
     entirely separate stack.  I believe this must have been done
     specifically for the purpose of helping TSR writers.  Does anyone
     know why the hell Microsoft built these neat functions into DOS and
     then refused to acknowledge their existence?


     INTERRUPT & FUNCTION CALLS

     INT 08
          Timer tick interrupt.  Called 18.2 times a second on IRQ 0.
          The interrupt is triggered by timer 0.

     INT 21, FUNCTION 34
          inDOS flag address request.  This function returns the address
          of the "inDOS flag" as a 32 bit pointer in ES:BX.  The inDOS
          flag is a byte that is zero when DOS is not processing a
          function request, and is non-zero when DOS is in a function.

;Page 24

          NOTE: This function is officially specified as RESERVED.  Its
          use could change in future versions of DOS, and it can only be
          guaranteed to work in straight IBM PC-DOS or MS-DOS versions
          2.0 to 3.30.  Use at your own risk.

     INT 28
          DOS keyboard busy loop.  This interrupt is called when DOS is
          waiting for a keystroke in the console input functions.  When
          this interrupt is issued, it is safe to use any DOS call ABOVE
          0C.  Calls to DOS functions 0 - 0C will trash the stack and do
          nasty things.

          NOTE: This function is officially RESERVED.  See the note for
          function 34 above.


     AUTHOR'S NOTE:

     First, the references listed below are really great.  They've
     helped me out a lot over the past few years.

     Second, if your hard disk gets munched by your TSR, read the
     disclaimer.


        ! ! ! ! ! !  C A V E A T   P R O G R A M M E R  ! ! ! ! ! !
                             & Disclaimer

     The techniques described in here are, for the most part,
     UNDOCUMENTED by Microsoft or IBM.  This means that you CAN NOT BE
     SURE that they will work on all IBM clones, and they could even
     cause crashes on some!  The timer tick interrupt provides some
     essential system services, and messing with it incautiously can
     wreak havoc.
     The program outlines presented here are what worked for me on my
     system, and what should work on about 90% of the clones out there.
     However, I still suggest that you find a reference for all of the
     interrupts and functions described here.  This file is meant to be
     a guideline and aid only.


     REFERENCES:

          DOS Programmer's Reference, by Terry R. Dettmann.
          $22.95, QUE Corporation

          IBM DOS Technical Reference, version 3.30
          $(?), International Business Machines Corp.
          I can't remember how much it cost...

;Page 25

                    Environment Variable Processor
                          by David O'Riva

~
        PAGE    60,132
        TITLE   Q43.ASM - editor prelude & display manager
;
;
COMMENT~***********************************************************************
*  ---===> All code in this file is copyright 1989 by David O'Riva <===---
*******************************************************************************
*
*    The above line is there only to prevent people (or COMPANIES) from
*    claiming original authorship of this code and suing me for using it.
*    You're welcome to use it anyhow you care to.
*
*
*  Environment Variable Finder & Processor -
*
*    The "get_environment_variable" routine is complete in itself, and can
*  be extracted and used in anything else that needs one.  Just copy the entire
*  routine, from the header to the endp (don't forget the RADIX and DW).
*  The routine currently uses 315 (decimal) bytes.
*
*
*    This program's purpose is to invoke an editor (or any program, really)
*  with a specific machine state depending on environment variables. (Yeah!!!)
*  Currently it is set up to change my screen to one of various modes, with
*  the variable ED_SCRMODE being set to:
*       100/75 = 100 columns by 75 lines
*       132/44 = 132 by 44
*       80/44  = 80 by 44
*  ...and then to EXEC my editor (qedit) with that mode set.  You could
*  set the screen back to the standard 80x25 after the EXEC returns.
*
*    Note:  The 80/44 set code should work on most (or all?) EGAs.
The * other two high-res text modes use built-in extended BIOS modes in my * Everex EV-657 EGA card (the 800x600 version) w/multisync monitor. If you've * got one of those, you're in luck - no mods needed. It will also work on the * EV-673 EVGA card w/appropriate monitor. * * * Note to BEGINNERS: This is not an example of "good" asm code. This * file is an example of what happens when you're up at 1:00am with too much * coffee and a utility that needs to be fixed. * * * * This is a COM program, not an EXE. Remember to use EXE2BIN. * * * ******************************************************************************~ ; ; TRUE EQU 0FFH FALSE EQU 0 ; ;****************************************************************************** ; CODE SEGMENT PARA PUBLIC 'CODE' ASSUME CS:CODE,DS:CODE,ES:CODE,SS:CODE ; MAIN PROC NEAR ;Page 26 ORG 100H entry: ;------------------------------------------------------------------------------ ; set the screen to the correct mode ;------------------------------------------------------------------------------ call set_screen_mode ;------------------------------------------------------------------------------ ; check for pathname change in environment ;------------------------------------------------------------------------------ call set_exec_name ;------------------------------------------------------------------------------ ; setup memory and run the program ;------------------------------------------------------------------------------ MOV BX,OFFSET ENDRESIDENT ;deallocate unnecessary memory MOV CL,4 SHR BX,CL INC BX MOV AH,04AH INT 021H MOV AX,CS ;exec the program MOV INSERT_CS1,AX MOV INSERT_CS2,AX MOV INSERT_CS3,AX MOV AX,04B00H MOV BX,OFFSET EXECPARMS MOV DX,OFFSET PROGNAME INT 021H ;------------------------------------------------------------------------------ ; clean up and leave ;------------------------------------------------------------------------------ MOV AH,04DH ;get return code from program INT 021H MOV AH,04CH ;leave INT 
021H ; ;****************************************************************************** ; ; data ; PROGNAME DB 'F:\UTILITY\MISC\Q.EXE',0 db 100 dup(' ') EXECPARMS DW 0 ;use current environment DW 080H ;use current command tail INSERT_CS1 DW ? DW 05CH ;use current FCB's INSERT_CS2 DW ? DW 06CH INSERT_CS3 DW ? ;Page 27 ENDRESIDENT: ;****************************************************************************** ; more data - used only for setup & checks ; valid_modes db '80/44 ' db '132/44' db '100/75' screen_mode db ' ' mode_jump dw goto_43 dw goto_132 dw goto_100 ev_mode db 'ED_SCRMODE',0 ev_pathname db 'ED_PATH',0 PAGE ;****************************************************************************** ; set_screen_mode - ; ; ; ENTRY: ; ; EXIT: ; ; DESTROYED: ; ;------------------------------------------------------------------------------ set_screen_mode: MOV AH,012H ;check for presence of EGA/VGA MOV BL,010H INT 010H CMP BL,010H ;BL changed? (should have # of ; bytes of EGA memory) JE ssm_no_ega ;This is no EGA! 
;------------------------------------------------------------------------------ ; check environment for correct mode set - ; don't set mode if none specified ;------------------------------------------------------------------------------ mov si,offset ev_mode mov di,offset screen_mode mov cx,6 ;accept 6 chars mov ax,4 ;get fixed-length string call get_environment_variable and ax,0feh jne ssm_no_env_mode ;------------------------------------------------------------------------------ ; look up the variable's value in my mode table ;------------------------------------------------------------------------------ mov bx,0 mov di,offset valid_modes ;Page 28 ssm_check_mode: mov dx,di mov si,offset screen_mode mov cx,6 repe cmpsb je ssm_found_mode mov di,dx add di,6 inc bx cmp bx,3 jne ssm_check_mode jmp ssm_bad_mode ;------------------------------------------------------------------------------ ; set the correct screen mode ;------------------------------------------------------------------------------ ssm_found_mode: shl bx,1 jmp mode_jump[bx] goto_100: mov ax,0070h mov bx,8 int 010h jmp ssm_leave goto_132: mov ax,0070h mov bx,0bh int 010h jmp ssm_leave goto_43: MOV AX,3 INT 010H MOV AX,01112H ;set to 8x8 chars (43/50 lines) MOV BL,0 INT 010H ssm_no_env_mode: ssm_bad_mode: ssm_no_ega: ssm_leave: ret PAGE ;****************************************************************************** ; set_exec_name - ; ; ; ENTRY: ; ; EXIT: ; ; DESTROYED: ; ;------------------------------------------------------------------------------ set_exec_name: ; ; If you want, write a chunk here that will read an alternate pathname ; for the editor to be executed from a different variable (like ED_PATH) ;Page 29 ; I was going to do it, but ran out of time and need. (My editor never wanders ; around!) 
; ret PAGE ;****************************************************************************** ; Get_environment_variable - ; ; ; ; ; ENTRY: ds:[si] -> ASCIIZ environment variable name ; ds:[di] -> (up to) 129 byte buffer for string ; es = segment of program's PSP ; cx = maximum # of characters to accept ; al = variable return format ; 0 - return string in ASCIIZ format ; xxxxx 0 ........ ; ; 1 - return string in DOS string ('$' terminated) format ; xxxxxxxx $ ........ ; ; 2 - return string in DOS input buffer format ; maxchrs,numchrs,xxxxxxxx CR ............ ; ; 3 - return string in command tail format ; numchrs,xxxxxxxxxxx CR .......... ; ; 4 - return string in fixed-length (CX chars) format ; xxxxxx ; ; EXIT: al = return codes: ; bit 0 - if set, string was longer than max, truncated ; 1 - if set, string did not exist ; 2 - if set, invalid return format requested ; ; DESTROYED: ah is undefined ; ;------------------------------------------------------------------------------ .RADIX 010h gev_flags dw ? 
Get_environment_variable: push bx push cx push dx push si ;Page 30 push di push es mov cs:gev_flags,ax mov es,es:[02c] ;es -> program's environment ;------------------------------------------------------------------------------ ; make sure the environment has at least one variable in it ;------------------------------------------------------------------------------ mov ax,es:[0] cmp ax,0 jne gev_exists mov ax,2 jmp gev_leave ;------------------------------------------------------------------------------ ; find length of search string ;------------------------------------------------------------------------------ gev_exists: push cx push di mov di,si mov cx,0ffff gev_sourcelen: inc cx mov al,[di] inc di cmp al,0 jne gev_sourcelen cmp cx,0 jne gev_startfind pop di pop cx mov ax,2 jmp gev_leave ;------------------------------------------------------------------------------ ; find string ;------------------------------------------------------------------------------ gev_startfind: mov bx,cx mov dx,si mov di,0 gev_checknext: mov cx,bx mov si,dx repe cmpsb je gev_found? 
gev_tonextvar: mov cx,0ffff mov al,0 repne scasb cmp es:[di],al jne gev_checknext mov ax,2 pop di pop cx jmp gev_leave ;Page 31 gev_found?: cmp byte ptr es:[di],'=' jne gev_tonextvar ;------------------------------------------------------------------------------ ; found the string in the environment ;------------------------------------------------------------------------------ gev_found: inc di mov si,di pop di pop cx cmp cs:gev_flags,1 ja gev_ibufform ;------------------------------------------------------------------------------ ; move normal string with 0 or $ terminator ;------------------------------------------------------------------------------ gev_nextchar0: mov al,es:[si] cmp al,0 je gev_setterm0 mov ds:[di],al inc si inc di dec cx jne gev_nextchar0 mov al,es:[si] cmp al,0 je gev_setterm0 mov al,1 gev_setterm0: cmp cs:gev_flags,0 jne gev_setterm1 mov byte ptr ds:[di],0 ;ASCIIZ string jmp gev_leave gev_setterm1: mov byte ptr ds:[di],'$' ;DOS string jmp gev_leave ;------------------------------------------------------------------------------ ; move string into DOS input buffer format (int 21 function 0A) ;------------------------------------------------------------------------------ gev_ibufform: cmp cs:gev_flags,2 jne gev_ctailform mov ds:[di],cl ;set max length inc di mov bx,di inc di mov dx,0 gev_nextchar2: mov al,es:[si] cmp al,0 je gev_setterm2 mov ds:[di],al inc si inc di inc dx dec cx jne gev_nextchar2 mov al,es:[si] cmp al,0 ;Page 32 je gev_setterm2 mov al,1 gev_setterm2: mov byte ptr ds:[di],0d ;add carriage return mov ds:[bx],dl ;set actual # of chars jmp gev_leave ;------------------------------------------------------------------------------ ; move string into command tail format ;------------------------------------------------------------------------------ gev_ctailform: cmp cs:gev_flags,3 jne gev_fixedform mov bx,di inc di mov dx,0 gev_nextchar3: mov al,es:[si] cmp al,0 je gev_setterm3 mov ds:[di],al inc si inc di inc dx dec cx jne 
gev_nextchar3 mov al,es:[si] cmp al,0 je gev_setterm3 mov al,1 gev_setterm3: mov byte ptr ds:[di],0d ;set carriage return mov ds:[bx],dl ;set # of bytes jmp gev_leave ;------------------------------------------------------------------------------ ; move string into fixed-length area (pad it out with spaces) ;------------------------------------------------------------------------------ gev_fixedform: cmp cs:gev_flags,4 jne gev_badform gev_nextchar4: mov al,es:[si] cmp al,0 je gev_padout4 mov ds:[di],al inc si inc di dec cx jne gev_nextchar4 mov al,es:[si] cmp al,0 je gev_setterm4 mov al,1 jmp gev_setterm4 gev_padout4: mov byte ptr ds:[di],' ' inc di dec cx jne gev_padout4 ;Page 33 mov al,0 gev_setterm4: jmp gev_leave gev_badform: mov ax,4 gev_leave: pop es pop di pop si pop dx pop cx pop bx ret .RADIX 00ah MAIN ENDP ; ;****************************************************************************** ; CODE ENDS ; ;****************************************************************************** ; END ENTRY ~ ;Page 34 Program Reviews Multi-Edit ver 4.00a (demo version): Reviewed by Patrick O'Riva. Multi-Edit is a high feature text editor with many word processor features. The demo version is completely functional though some of the reference material is not supplied and there are advertising screens. I consider this fully acceptable as shareware. The complete version with the macro reference library is available for 79.95 and an expanded version with a spelling checker, integrated Communication terminal and phone book is $179.95. I couldn't list all of its features here, but in addition to everything you have come to expect in a quality programming editor (multi meg files, programmable keyboard etc.) there are a number of powerful additions you might not expect. The word processor functions rival most of the specialty ones that I've tried. 
     It won't compete with the major names for those of you who are
     addicted to them, but it does offer full printer support, preview
     file, table of contents generation, and extension keyed formatting.
     It will right or left justify, and supports headers and footers,
     and auto pagination.  It contains a calculator and an ASCII table.

     Saving the best for last: The language support is very strong.  It
     has built in templates for many common constructs, and the
     assembler/compiler is invoked from within the editor with a single
     key.  It will read the error table generated by a variety of
     software and with successive key presses move you to each line
     where an error was found.

     Something which I found unique is Multi-Edit's help system.  It is
     a hypertext system, and is wonderfully context sensitive most
     everywhere in the system.  From the Help menu it has a complete
     table of contents and index.  It is also fully user extendible.  I
     have integrated a database I have documenting the full set of
     interrupts that totals about 400k and the documentation on my
     spelling checker as well (which integrated into Multi-Edit almost
     seamlessly).

     In many ways this is the best editor I've ever used, but it does
     have a few faults, some of which are very subtle and may not even
     be problems to most users.  It is a 'tad' slower than what I'm used
     to with Qedit.  This is seldom noticed except in the execution of
     complex macros.  It is quite slow in paging through long files.
     There are some true bugs in this version such as a crash of the
     program (but not the data or the system) when large deletes from
     large files are made.  Multi-Edit's treatment of file windows,
     while very versatile, is slightly different and may take some time
     to get used to.

     For all of its advantages, until putting this Magazine together, I
     still found myself reverting to Qedit for the speed and ease of
     use.  Multi-Edit is the first software that has made assembling the
     Magazine anything other than an exercise in frustration.
;Page 35 SHEZ Just a quick mention because it isn't programming related. Shez is a compression shell along the lines of ArcMaster and Qfiler. It is a fine and versatile piece of programming, supporting all common compression types. The more recent versions have virus detection when used with the SCANV programs by John McAfee. 4DOS This is a program that is an absolute joy to use. It is a complete and virtually 100% compatible replacement for Command.com. The code size is just slightly larger than MSDOS 3.3 command.com but the added and enhanced functions save many times that amount in TSR's you no longer need to install. Just to mention a few features: An alias command whereby you can assign whatever mnemonic you wish to a command or string of commands. Select is a screen interface that allows you to mark files for use with a command. Except will execute a command for a set of files excluding one or more. There is an environment editor, built in Help, command and filename completion, Global that will execute through the directory tree, A Timer to keep track of elapsed time, as well as many enhanced batch file commands. Additional features are too numerous to mention. The current version is 4.23 and is available as Shareware, but you should register after your first 10 minutes of use. You will be hooked forever. The above 3 programs should all be available on your local BBS's. Please be sure and register programs you use. ;Page 36 Book Reviews Assembly Language Quick Reference by Allen L. Wyatt, Sr. Reviewed by George A. Stanislav This 1989 book published by QUE is a nice and handy reference for assembly language programmers. Instruction sets for six microprocessors and numeric coprocessors are listed: 8086/8088 8087 80286 80287 80386 80387 I could find no reference to the 80186 microprocessor, not even a suggestion that it uses the 80286 instruction set but does not multitask. 
     Because the 80186 was the brain of the Tandy 2000, quite a popular
     computer in its own time, its omission from the book is surprising.

     There is no division into chapters.  This makes it somewhat hard to
     figure out where the instruction sets of individual processors
     start.  Each higher processor set contains only the list of
     instructions that are new for the processor or that changed
     somewhat.

     After a brief introduction, the book starts by listing,
     alphabetically, all 8086/8088 instructions.  The listing itself is
     very well done.  Each instruction stands out graphically from the
     rest of the text.  For every code there is some classification,
     e.g. arithmetic, bit manipulation, data-transfer.  This is followed
     by a very brief description ended with a colon.  Next, a more
     detailed explanation tells any assembly language programmer what
     the instruction does.  If applicable, the book lists flags affected
     by the instruction.  Most instructions also contain some coding
     examples.

     The 8086/8088 instruction set is followed by the 80286 set, or
     rather subset as it only contains the instructions new to this
     microprocessor.  Similarly, the 80386 section contains only those
     instructions not found in the 8086/8088 and 80286 sections as well
     as those that changed somewhat.

     I find it puzzling that among those instructions considered changed
     in the 80386 microprocessor we can find AND, NEG, POP - because
     they can be used as 32-bit instructions in addition to their
     original usage - but cannot find JE, JNE, and all other conditional
     jumps.  These did indeed change in the 80386 processor inasmuch as
     they can be used either as SHORT or as NEAR while on the older
     microprocessors they could only jump within the SHORT range.

;Page 37

     The rest of the book contains instructions for the math
     coprocessors, the 8087, 80287 and 80387.  This section is divided
     in the same way as the microprocessor part, i.e.
     describing first the 8087 set, then the one new instruction for the
     80287, followed by the new 80387 instructions.

     There are several possibilities of improvement QUE might consider
     for future editions of this book:

        o Make it easier to find the start of each section by color
          coding the side of the paper;

        o Include references to the instructions of the older
          processors within the listing for the new processors.  Small
          print of the instruction with the page number where a more
          detailed description can be found would be a nice
          enhancement;

        o At least a brief mention of the 80186 microprocessor and
          perhaps the V-20 and V-30 would be useful.

     Despite the possibility of improvement, this is an excellent
     reference for any assembly language programmer.  Its small size
     makes it very handy to keep it next to the computer as well as to
     take it along when travelling.

     The book costs $6.95 in USA and $8.95 in Canada.

;Page 38

                               GPFILT.ASM
                    =========================================
~
        page    ,132
        TITLE   GPFILT
        subttl  General Purpose Filter Template
;
; GPFILT.ASM
; This file contains a template for a general-purpose assembly language
; filter program.
;
; Fill in the blanks for what you wish to do.  The program is set up to
; accept a command line in the form:
;       COMMAND [{-|/}options] [infile [outfile]]
;
; If infile is not specified, stdin is used.
; If outfile is not specified, stdout is used.
;
; To assemble and link:
;       MASM GPFILT ;
;       LINK GPFILT ;
;       EXE2BIN GPFILT GPFILT.COM
;
; Standard routines supplied in the general shell are:
;
; get_arg - returns the address of the next command line argument in
;           DX.  Since this is a .COM file, the routine assumes DS will
;           be the same as the command line segment.
;           The routine will return with Carry set when it reaches the end
;           of the command line.
;
; err_msg - displays an ASCIIZ string on the STDERR device.  Call with the
;           address of the string in ES:DX.
;
; do_usage- displays the usage message on the STDERR device and exits
;           with an error condition (errorlevel 1).  This routine will
;           never return.
;
; getch   - returns the next character from the input stream in AL.
;           It will return with carry set if an error occurs during read.
;           It will return with the ZF set at end of file.
;
; putch   - writes a character from AL to the output stream.  Returns with
;           carry set if a write error occurs.
;
cseg    segment
        assume  cs:cseg, ds:cseg, es:cseg, ss:cseg
        org     0100h                   ;for .COM files
start:  jmp     main                    ;jump around data area
;
; Equates and global data area.
;
; The following equates and data areas are required by the general filter
; routines.  User data area follows.
;
;Page 39
STDIN   equ     0
STDOUT  equ     1
STDERR  equ     2
STDPRN  equ     4                       ;DOS handle 3 is STDAUX, 4 is STDPRN
cr      equ     0dh
lf      equ     0ah
space   equ     32
tab     equ     9

infile  dw      STDIN                   ;default input file is stdin
outfile dw      STDOUT                  ;default output file is stdout
errfile dw      STDERR                  ;default error file is stderr
prnfile dw      STDPRN                  ;default print file is stdprn
cmd_ptr dw      0081h                   ;address of first byte of command tail
PSP_ENV equ     002ch                   ;The segment address of the environment
                                        ;block is stored here.
infile_err  db  cr, lf, 'Error opening input file', 0
outfile_err db  cr, lf, 'Error opening output file', 0
aborted     db  07, cr, lf, 'Program aborted', 0
usage       db  cr, lf, 'Usage: ', 0
crlf        db  cr, lf, 0
;************************************************************************
;*                                                                      *
;* Buffer sizes for input and output files. The buffers need not be     *
;* the same size. For example, a program that expands tabs in a text    *
;* file will output more characters than it reads. Therefore, the       *
;* output buffer should be slightly larger than the input buffer. In    *
;* general, the larger the buffer, the faster the program will run.     *
;*                                                                      *
;* The only restriction here is that the combined size of the buffers   *
;* plus the program code and data size cannot exceed 64K.               *
;*                                                                      *
;* The easiest way to determine maximum available buffer memory is to   *
;* assemble the program with minimum buffer sizes and examine the value *
;* of the endcode variable at the end of the program. Subtracting this  *
;* value from 65,536 will give you the total buffer memory available.   *
;* Leave a few hundred bytes of that for the stack, which a .COM        *
;* program keeps at the top of its 64K segment.                         *
;*                                                                      *
;************************************************************************
;
INNBUF_SIZE     equ     31              ;size of input buffer (in K)
OUTBUF_SIZE     equ     31              ;size of output buffer (in K)
;
;************************************************************************
;*                                                                      *
;* Data definitions for input and output buffers. DO NOT modify these   *
;* definitions unless you know exactly what it is you're doing!         *
;*                                                                      *
;************************************************************************
;
; Input buffer
ibfsz   equ     1024*INNBUF_SIZE        ;input buffer size in bytes
inbuf   equ     endcode                 ;input buffer
ibfend  equ     inbuf + ibfsz           ;end of input buffer
;
;Page 40
; ibfptr is initialized to point past end of input buffer so that the first
; call to getch will result in a read from the file.
;
ibfptr  dw      inbuf+ibfsz
; output buffer
obfsz   equ     1024*OUTBUF_SIZE        ;output buffer size in bytes
outbuf  equ     ibfend                  ;output buffer
obfend  equ     outbuf + obfsz          ;end of output buffer
obfptr  dw      outbuf                  ;start at beginning of buffer
;************************************************************************
;*                                                                      *
;*                            USER DATA AREA                            *
;*                                                                      *
;* Insert any data declarations specific to your program here.          *
;*                                                                      *
;* NOTE: The prog_name, use_msg, and use_msg1 variables MUST be         *
;*       defined.                                                       *
;*                                                                      *
;************************************************************************
;
; This is the program name.  Under DOS 3.x, this is not used because we
; can get the program name from the environment.  Prior to 3.0, this
; information is not supplied by the OS.
;
prog_name db    'GPFILT', 0
;
; This is the usage message.  The first two lines are required.
; The first line is the program's title line.
; Make sure to include the 0 at the end of the first line!!
; The second line shows the syntax of the program.
; Following lines (which are optional) discuss options, features,
; etc...
; The message MUST be terminated by a 0.
;
use_msg db      ' - General Purpose FILTer program.', cr, lf, 0
use_msg1 label  byte
        db      '[{-|/}options] [infile [outfile]]', cr, lf
        db      cr, lf
        db      'If infile is not specified, STDIN is used', cr, lf
        db      'If outfile is not specified, STDOUT is used', cr, lf
        db      0
;
;************************************************************************
;*                                                                      *
;* The main routine parses the command line arguments, opens files, and *
;* does other initialization tasks before calling the filter procedure  *
;* to do the actual work.                                               *
;* For a large number of filter programs, this routine will not need to *
;* be modified. Options are parsed in the get_options proc., and the    *
;* filter proc. does all of the 'filter' work.                          *
;*                                                                      *
;************************************************************************
;
main:   cld
        call    get_options             ;process options
;Page 41
        jc      gofilter                ;carry indicates end of arg list
        mov     ah,3dh                  ;open file
        mov     al,0                    ;read access
        int     21h                     ;open the file
        mov     word ptr ds:[infile], ax ;save file handle
        jnc     main1                   ;carry clear indicates success
        mov     dx,offset infile_err
        jmp     short err_exit
main1:  call    get_arg                 ;get cmd line arg in DX
        jc      gofilter                ;carry indicates end of arg list
        mov     ah,3ch                  ;create file
        mov     cx,0                    ;normal file
        int     21h                     ;create the file
        mov     word ptr ds:[outfile],ax ;save file handle
        jnc     gofilter                ;carry clear indicates success
        mov     dx,offset outfile_err
        jmp     short err_exit
gofilter:
        call    filter                  ;do the work
        jc      err_exit                ;exit immediately on error
        mov     ah,3eh
        mov     bx,word ptr [infile]
        int     21h                     ;close input file
        mov     ah,3eh
        mov     bx,word ptr [outfile]
        int     21h                     ;close output file
        mov     ax,4c00h
        int     21h                     ;exit with no error
err_exit:
        call    err_msg                 ;output error message
        mov     dx,offset aborted
        call    err_msg
        mov     ax,4c01h
        int     21h                     ;and exit with error
;
;************************************************************************
;*                                                                      *
;* get_options processes any command line options. Options are          *
;* preceded by either - or /. There is a lot of flexibility here.       *
;* Options can be specified separately, or as a group. For example,     *
;* the command "GPFILT -x -y -z" is equivalent to "GPFILT -xyz".        *
;*                                                                      *
;* This routine MUST return the address of the next argument in DX or   *
;* carry flag set if there are no more arguments. In other words, return*
;* what was returned by the last call to get_arg.                       *
;*                                                                      *
;************************************************************************
;
get_options proc
        call    get_arg                 ;get command line arg
        jnc     opt1
; If at least one argument is required, use this line
;       call    do_usage                ;displays usage msg and exits
; If there are no required args, use this line
        ret                             ;if no args, just return
opt1:   mov     di, dx
        mov     al,byte ptr ds:[di]
;Page 42
        cmp     al,'-'                  ;if first character of arg is '-'
        jz      opt_parse
        cmp     al,'/'                  ;or '/', then get options
        jz      opt_parse
        clc                             ;the CMP leaves carry set for chars
                                        ;below '/' (such as '.'); clear it
        ret                             ;otherwise exit, argument in DX
opt_parse:
        inc     di
        mov     al,byte ptr ds:[di]
        or      al,al                   ;if end of options string
        jz      nxt_opt                 ;get cmd. line arg
        cmp     al,'?'                  ;question means show usage info
        jz      do_usage
;
;************************************************************************
;*                                                                      *
;* Code for processing other options goes here. The current option      *
;* character is in AL, and the remainder of the option string is pointed*
;* to by DS:DI.                                                         *
;*                                                                      *
;************************************************************************
;
        jmp     short opt_parse
nxt_opt:
        call    get_arg                 ;get next command line arg
        jnc     opt1                    ;if carry
vld_args:                               ;then validate arguments
;
;************************************************************************
;*                                                                      *
;* Validate arguments. If some options are mutually exclusive/dependent *
;* use this area to validate them. Whatever the case, if you must       *
;* abort the program, call the do_usage procedure to display the usage  *
;* message and exit the program.                                        *
;*                                                                      *
;************************************************************************
;
        ret                             ; no more options
;
;************************************************************************
;*                                                                      *
;* Filter does all the work. Modify this routine to do what it is you   *
;* need done.                                                           *
;*                                                                      *
;************************************************************************
;
filter  proc
        call    getch                   ;get a character from input into AL
        jbe     filt_done               ;exit on error or EOF
        and     al, 7fh                 ;strip the high bit
        call    putch                   ;and output it
        jc      filt_ret                ;exit on error
        jmp     short filter
filt_done:
        jc      filt_ret                ;carry set is error
        call    write_buffer            ;output what remains of the buffer
filt_ret:
;Page 43
        ret
filter  endp
;
;************************************************************************
;*                                                                      *
;* Put any program-specific routines here                               *
;*                                                                      *
;************************************************************************
;
;************************************************************************
;*                                                                      *
;* For most programs, nothing beyond here should require modification.  *
;* The routines that follow are standard routines used by almost every  *
;* filter program.                                                      *
;*                                                                      *
;************************************************************************
;
;************************************************************************
;*                                                                      *
;* This routine outputs the usage message to the STDERR device and      *
;* aborts the program with an error code. A little processing is done   *
;* here to get the program name and format the output.                  *
;*                                                                      *
;************************************************************************
;
do_usage:
        mov     dx, offset crlf
        call    err_msg                 ;output newline
        mov     ah,30h                  ;get DOS version number
        int     21h
        sub     al,3                    ;check for version 3.x
        jc      lt3                     ;if carry, earlier than 3.0
;
; For DOS 3.0 and later the full pathname of the file used to load this
; program is stored at the end of the environment block.  We first scan
; all of the environment strings in order to find the end of the env, then
; scan the load pathname looking for the file name.
;
        push    es
        mov     ax, word ptr ds:[PSP_ENV]
        mov     es, ax                  ;ES is environment segment address
        mov     di, 0
        mov     cx, 0ffffh              ;this ought to be enough
        xor     ax, ax
getvar: scasb                           ;get char
        jz      end_env                 ;end of environment
gv1:    repnz   scasb                   ;look for end of variable
        jmp     short getvar            ;and loop 'till end of environment
end_env:
        inc     di
        inc     di                      ;bump past word count
;
; ES:DI is now pointing to the beginning of the pathname used to load the
; program.  We will now scan the filename looking for the last path specifier
; and use THAT address to output the program name.  The program name is
; output WITHOUT the extension.
;Page 44
;
        mov     dx, di
fnloop: mov     al, byte ptr es:[di]
        or      al, al                  ;if end of name
        jz      do30                    ;then output it
        inc     di
        cmp     al, '\'                 ;if path specifier
        jz      updp                    ;then update path pointer
        cmp     al, '.'                 ;if '.'
        jnz     fnloop
        mov     byte ptr es:[di-1], 0   ;then place a 0 so we don't get ext
        jmp     short fnloop            ; when outputting prog name
updp:   mov     dx, di                  ;store
        jmp     short fnloop
;
; ES:DX now points to the filename of the program loaded (without extension).
; Output the program name and then go on with rest of usage message.
;
do30:   call    err_msg                 ;output program name
        pop     es                      ;restore
        jmp     short gopt3
;
; We arrive here if the current DOS version is earlier than 3.0.  Since the
; loaded program name is not available from the OS, we'll output the name
; entered in the 'prog_name' field above.
;
lt3:    mov     dx, offset prog_name
        call    err_msg                 ;output the program name
;
; After outputting program name, we arrive here to output the rest of the
; usage message.  This code assumes that the usage message has been
; written as specified in the data area.
;
gopt3:  mov     dx, offset use_msg
        call    err_msg                 ;output the message
        mov     dx, offset usage
        call    err_msg
        mov     dx, offset use_msg1
        call    err_msg
        mov     ax,4c01h
        int     21h                     ;and exit with error
get_options endp
;
;************************************************************************
;*                                                                      *
;* Output a message (ASCIIZ string) to the standard error device.       *
;* Call with address of error message in ES:DX.                         *
;*                                                                      *
;************************************************************************
;
err_msg proc
        cld
        mov     di,dx                   ;string address in di
        mov     cx,0ffffh
        xor     ax,ax
        repnz   scasb                   ;find end of string
;Page 45
        xor     cx,0ffffh
        dec     cx                      ;CX is string length
        push    ds
        mov     ax,es
        mov     ds,ax                   ;DS is segment address
        mov     ah,40h
        mov     bx,word ptr cs:[errfile]
        int     21h                     ;output message
        pop     ds
        ret
err_msg endp
;
;************************************************************************
;*                                                                      *
;* getch returns the next character from the file in AL.                *
;* Returns carry = 1 on error                                           *
;*         ZF = 1 on EOF                                                *
;* Upon exit, if either Carry or ZF is set, the contents of AL is       *
;* undefined.                                                           *
;*                                                                      *
;************************************************************************
;
; Local variables used by the getch proc.
eof     db      0                       ;set to 1 when EOF reached in read
last_ch dw      ibfend                  ;pointer just past last char in buffer
getch   proc
        mov     si,word ptr ds:[ibfptr] ;get input buffer pointer
        cmp     si,word ptr ds:[last_ch] ;if not at end of buffer
        jz      getch_eob
getch1: lodsb                           ;character in AL
        mov     word ptr ds:[ibfptr],si ;save buffer pointer
        or      ah,1                    ;will clear Z flag (and carry)
        ret                             ;and done
getch_eob:                              ;end of buffer processing
        cmp     byte ptr ds:[eof], 1    ;end of file?
        jnz     getch_read              ;nope, read file into buffer
getch_eof:
        mov     byte ptr ds:[eof], 1    ;remember EOF so we don't read again
        xor     ax, ax                  ;set Z to indicate EOF
        ret                             ;and return
getch_read:
; Read the next buffer full from the file.
        mov     ah,3fh                  ;read file function
        mov     bx,word ptr ds:[infile] ;input file handle
        mov     cx,ibfsz                ;#characters to read
        mov     dx,offset inbuf         ;read into here
        int     21h                     ;DOS'll do it for us
        jc      read_err                ;Carry means error
        or      ax,ax                   ;If AX is zero,
        jz      getch_eof               ;we've reached end-of-file
        add     ax,offset inbuf         ;compute end of the data just read
        mov     word ptr ds:[last_ch],ax ;and save it
        mov     si,offset inbuf
        jmp     short getch1            ;and finish processing character
;Page 46
read_err:                               ;return with error (carry set) and...
        mov     dx,offset read_err_msg  ; DX pointing to error message string
        ret
read_err_msg    db      'Read error', cr, lf, 0
getch   endp
;
;************************************************************************
;*                                                                      *
;* putch writes the character passed in AL to the output file.          *
;* Returns carry set on error. The character in AL is retained.         *
;*                                                                      *
;************************************************************************
;
putch   proc
        mov     di,word ptr ds:[obfptr] ;get output buffer pointer
        stosb                           ;save the character
        mov     word ptr ds:[obfptr],di ;and update buffer pointer
        cmp     di,offset obfend        ;if buffer pointer == buff end
        clc
        jnz     putch_ret
        push    ax
        call    write_buffer            ;then we've got to write the buffer
        pop     ax
putch_ret:
        ret
putch   endp
;
;************************************************************************
;*                                                                      *
;* write_buffer writes the output buffer to the output file.            *
;* This routine should not be called except by the putch proc. and at   *
;* the end of all processing (as demonstrated in the filter proc).      *
;*                                                                      *
;************************************************************************
;
write_buffer proc                       ;write buffer to output file
        mov     ah, 40h                 ;write to file function
        mov     bx, word ptr ds:[outfile] ;output file handle
        mov     cx, word ptr ds:[obfptr]
        sub     cx, offset outbuf       ;compute #bytes to write
        jz      wb_done                 ;buffer empty: a zero-length write
                                        ;would be mistaken for disk full
        mov     dx, offset outbuf       ;from this buffer
        int     21h                     ;DOS'll do it
        jc      write_err               ;carry is error
        or      ax,ax                   ;return value of zero
        jz      putch_full              ;indicates disk full
        mov     word ptr ds:[obfptr],offset outbuf
wb_done:
        clc
        ret
putch_full:                             ;disk is full
        mov     dx,offset disk_full
        stc                             ;exit with error
        ret
write_err:                              ;error occurred during write
;Page 47
        mov     dx,offset write_err_msg
        stc                             ;return with error
        ret
write_err_msg   db      'Write error', cr, lf, 0
disk_full       db      'Disk full', cr, lf, 0
write_buffer endp
;
;************************************************************************
;*                                                                      *
;* get_arg - Returns the address of the next command line argument in   *
;*         DX. The argument is in the form of an ASCIIZ string.         *
;*         Returns Carry = 1 if no more command line arguments.         *
;*         Upon exit, if Carry is set, the contents of DX is undefined. *
;*                                                                      *
;************************************************************************
;
get_arg proc
        mov     si,word ptr [cmd_ptr]
skip_space:                             ;scan over leading spaces and commas
        lodsb
        cmp     al,0                    ;if we get a null
        jz      sk0
        cmp     al,cr                   ;or a CR,
        jnz     sk1
sk0:    stc                             ;set carry to indicate failure
        ret                             ;and exit
sk1:    cmp     al,space
        jz      skip_space              ;loop until no more spaces
        cmp     al,','
        jz      skip_space              ;or commas
        cmp     al,tab
        jz      skip_space              ;or tabs
        mov     dx,si                   ;start of argument
        dec     dx
get_arg1:
        lodsb                           ;get next character
        cmp     al,cr                   ;argument separators are CR,
        jz      get_arg2
        cmp     al,space                ;space,
        jz      get_arg2
        cmp     al,','                  ;comma,
        jz      get_arg2
        cmp     al,tab                  ;and tab
        jnz     get_arg1
get_arg2:
        mov     byte ptr ds:[si-1], 0   ;delimit argument with 0
        cmp     al, cr                  ;if char is CR then we've reached
        jnz     ga2                     ; the end of the argument list
        dec     si
ga2:    mov     word ptr ds:[cmd_ptr], si ;save for next time 'round
        clc                             ;the CMP above sets carry after a
                                        ;tab; clear it, DX holds an argument
        ret                             ;and return
get_arg endp
;Page 48
endcode equ     $
cseg    ends
        end     start
----------
~
;Page 49
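
     As an illustration of filling in the template, here is a minimal
     sketch of a replacement for the filter proc that folds lower case
     ASCII letters to upper case instead of stripping the high bit.  It
     is not part of the original listing.  The label filt_out is a name
     made up for this example, and the code assumes that getch, putch
     and write_buffer behave exactly as their header comments describe.

```asm
;
; Hypothetical replacement for the filter proc: upper-case a text stream.
; Relies on getch, putch and write_buffer from the GPFILT template.
;
filter  proc
        call    getch                   ;next input character into AL
        jbe     filt_done               ;carry = error, zero = end of file
        cmp     al,'a'
        jb      filt_out                ;below 'a': pass through unchanged
        cmp     al,'z'
        ja      filt_out                ;above 'z': pass through unchanged
        sub     al,'a'-'A'              ;fold 'a'..'z' to 'A'..'Z'
filt_out:                               ;(label invented for this sketch)
        call    putch                   ;buffer the character for output
        jc      filt_ret                ;exit on write error, DX = message
        jmp     short filter
filt_done:
        jc      filt_ret                ;carry set is a read error
        call    write_buffer            ;flush what remains in the buffer
filt_ret:
        ret
filter  endp
```

     Assembled into the template in place of the original filter proc,
     a command such as GPFILT IN.TXT OUT.TXT (the file names are only
     examples) should copy IN.TXT to OUT.TXT with the letters a through
     z capitalized and every other byte left alone.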